port affine tests #7708
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7708
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures
As of commit 980e77b: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -162,7 +165,7 @@ def _check_dispatcher_dispatch(dispatcher, kernel, input, *args, **kwargs):
     if isinstance(input, datapoints._datapoint.Datapoint):
         # Due to our complex dispatch architecture for datapoints, we cannot spy on the kernel directly,
         # but rather have to patch the `Datapoint.__F` attribute to contain the spied on kernel.
-        spy = mock.MagicMock(wraps=kernel)
+        spy = mock.MagicMock(wraps=kernel, name=kernel.__name__)
QoL
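For context, a minimal sketch of why naming the spy is a quality-of-life improvement (the kernel below is a stand-in, not the real one): an unnamed `MagicMock` shows up anonymously in failure output, while a named one identifies the kernel it wraps.

```python
from unittest import mock


def horizontal_flip_image_tensor(image):
    # Stand-in for a real kernel; only used to illustrate the spy's repr.
    return image


anonymous_spy = mock.MagicMock(wraps=horizontal_flip_image_tensor)
named_spy = mock.MagicMock(wraps=horizontal_flip_image_tensor, name=horizontal_flip_image_tensor.__name__)

print(anonymous_spy)  # <MagicMock id=...>  -- no hint which kernel is being spied on
print(named_spy)      # <MagicMock name='horizontal_flip_image_tensor' id=...>
```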
("dtype", "make_mask"), [(torch.uint8, make_segmentation_mask), (torch.bool, make_detection_mask)] | ||
) | ||
def test_kernel_mask(self, dtype_and_make_mask): | ||
dtype, make_mask = dtype_and_make_mask | ||
def test_kernel_mask(self, dtype, make_mask): |
QoL
@@ -744,7 +746,7 @@ def _make_input(self, input_type, *, dtype=None, device="cpu", spatial_size=(17,
     @pytest.mark.parametrize("dtype", [torch.float32, torch.uint8])
     @pytest.mark.parametrize("device", cpu_and_cuda())
     def test_kernel_image_tensor(self, dtype, device):
-        check_kernel(F.horizontal_flip_image_tensor, self._make_input(torch.Tensor))
+        check_kernel(F.horizontal_flip_image_tensor, self._make_input(torch.Tensor, dtype=dtype, device=device))
We didn't spot that before
             (torch.Tensor, F.horizontal_flip_image_tensor),
             (PIL.Image.Image, F.horizontal_flip_image_pil),
             (datapoints.Image, F.horizontal_flip_image_tensor),
             (datapoints.BoundingBox, F.horizontal_flip_bounding_box),
             (datapoints.Mask, F.horizontal_flip_mask),
             (datapoints.Video, F.horizontal_flip_video),
         ],
     )
     def test_dispatcher_signature(self, kernel, input_type):
-        check_dispatcher_signatures_match(F.resize, kernel=kernel, input_type=input_type)
+        check_dispatcher_signatures_match(F.horizontal_flip, kernel=kernel, input_type=input_type)
We need to pay more attention when copy-pasting / reviewing. This went undetected.
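For illustration only, and not the actual `check_dispatcher_signatures_match` helper: a hedged sketch of the kind of `inspect.signature`-based comparison such a check might perform.

```python
import inspect


def check_signatures_match(dispatcher, kernel):
    # Hypothetical check: the dispatcher and the kernel should expose the same
    # parameters (beyond the input itself) with the same names and order.
    dispatcher_params = list(inspect.signature(dispatcher).parameters)[1:]
    kernel_params = list(inspect.signature(kernel).parameters)[1:]
    assert dispatcher_params == kernel_params, (
        f"{dispatcher.__name__} and {kernel.__name__} disagree: "
        f"{dispatcher_params} vs. {kernel_params}"
    )
```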
        return input

    def _adapt_fill(self, value, *, dtype):
This will likely need to be factored out later, since we need this for every transform that supports fill.
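To make the intent concrete, a hypothetical sketch of what such a shared fill-adapting helper could look like; the name and the scaling rule are assumptions for illustration, not the code in this PR.

```python
import torch


def adapt_fill(value, *, dtype):
    # Hypothetical helper: map a fill value given in float convention ([0.0, 1.0])
    # onto the value range of the target dtype, e.g. [0, 255] for torch.uint8.
    if value is None:
        return value

    max_value = 1.0 if dtype.is_floating_point else float(torch.iinfo(dtype).max)
    if isinstance(value, (int, float)):
        return type(value)(value * max_value)
    return [type(v)(v * max_value) for v in value]
```

Under this rule, `adapt_fill([0.0, 0.5, 1.0], dtype=torch.uint8)` would yield `[0.0, 127.5, 255.0]`.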
@@ -949,18 +878,6 @@ def test_correctness_crop_bounding_box(device, format, top, left, height, width,
     torch.testing.assert_close(output_spatial_size, spatial_size)


-@pytest.mark.parametrize("device", cpu_and_cuda())
-def test_correctness_horizontal_flip_segmentation_mask_on_fixed_input(device):
Forgot about this last time
The affine tests run in 1.5 minutes on my laptop. Conversely, the Resize tests take 5 seconds (I understand there are fewer combinations to check there).
Can we do anything to reduce the duration of these tests?
A significant portion of these tests are tests that:
I'm thinking in particular about tests like
Regarding the long test time for this class:
Second to that is
If we can bring both tests down to a more reasonable duration, the total time of
To achieve that, we could move away from the full matrix of parameters. Meaning, we will fix most of the parameters and only parametrize over one or a few at the same time. This will drastically cut down the overall number of tests, while only leaving minimal gaps that will likely never result in a bug. Note that we are not talking about the correctness tests here. I'll send a commit to implement this strategy. None of that is orthogonal to running tests on a schedule. However, implementing that will take more time and thus I'll go for the low-hanging fruit first.
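A minimal sketch of that strategy with hypothetical parameter names (not the actual test code): crossing every parameter multiplies the test count, while fixing defaults and varying one axis per test only adds.

```python
import pytest
import torch


# Full matrix: 4 * 3 * 2 = 24 test cases for a single kernel.
@pytest.mark.parametrize("angle", [-87, 15, 90, 180])
@pytest.mark.parametrize("translate", [(5, 5), (-5, -5), (0, 0)])
@pytest.mark.parametrize("dtype", [torch.float32, torch.uint8])
def test_affine_full_matrix(angle, translate, dtype):
    ...


# Reduced: defaults everywhere, one parametrized axis per test, 4 + 3 + 2 = 9 cases.
@pytest.mark.parametrize("angle", [-87, 15, 90, 180])
def test_affine_angle(angle):
    ...


@pytest.mark.parametrize("translate", [(5, 5), (-5, -5), (0, 0)])
def test_affine_translate(translate):
    ...


@pytest.mark.parametrize("dtype", [torch.float32, torch.uint8])
def test_affine_dtype(dtype):
    ...
```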
After 39b91c8
I can live with that without going for a schedule solution for now.
Thanks Philip, a few comments but LGTM anyway
    )
    @pytest.mark.parametrize("fill", _CORRECTNESS_FILL)
    @pytest.mark.parametrize("seed", list(range(10)))
    def test_transform_image_correctness(self, center, interpolation, fill, seed):
I guess we can't unify this test with the functional one, because the parameters are too different (i.e. they're randomly sampled in the transform)?
That being said, I'm curious why the MAE thresholds are different in the two tests.
> I guess we can't unify this test with the functional one, because the parameters are too different (i.e. they're randomly sampled in the transform)?

Correct.

> That being said, I'm curious why the MAE thresholds are different in the two tests.

I just put the minimal threshold needed for the tests to pass. I guess in our case the transform test sampled other affine params that ultimately led to a higher MAE. I'll unify them.
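For reference, a minimal sketch of the MAE-style comparison being discussed; the helper name and the threshold are placeholders, not the values used in the test suite.

```python
import torch


def assert_close_by_mae(actual, expected, *, mae_threshold):
    # Mean absolute error between the produced image and the reference,
    # computed in float to avoid uint8 wrap-around.
    mae = (actual.float() - expected.float()).abs().mean()
    assert mae <= mae_threshold, f"MAE {mae:.3f} exceeds threshold {mae_threshold}"


actual = torch.randint(0, 256, (3, 17, 11), dtype=torch.uint8)
assert_close_by_mae(actual, actual.clone(), mae_threshold=1.0)
```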
kwargs["degrees"] = 0 | ||
|
||
with pytest.raises( | ||
ValueError if isinstance(value, list) else TypeError, match=f"{param} should be a sequence of length 2" |
Probably not worth addressing in this PR because it's not related to tests, but do you have any thoughts on unifying our parameter error types to always be ValueError? Distinguishing between ValueError and TypeError doesn't seem worth it and leads to that kind of special-casing.
I started out with the strong belief that one should differentiate between the two based on the documentation:

> Passing arguments of the wrong type (e.g. passing a list when an int is expected) should result in a TypeError, but passing arguments with the wrong value (e.g. a number outside expected boundaries) should result in a ValueError.
However, in practice it turns out that this distinction is not really helpful. I doubt anyone is doing
try:
    torchvision.transforms.functional.affine(...)
except TypeError:
    ...  # do something
except ValueError:
    ...  # do something else
The most likely scenario is that no one is using the exceptions in their regular flow, but rather only if there is actually an error. And in that case a human is going to look into it, and thus the message is much more important.
So overall I see few reasons to differentiate the two in torchvision. That might be different for other packages, where the users actually rely on the exception behavior. However, it doesn't bother me enough to actually go and address it. I'll review / stamp a PR if you want to bring one.
As a side note: the distinction between TypeError and ValueError here is terrible IMO, because we even use the same error message.
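To illustrate what that unification might look like, a hypothetical sketch (not existing torchvision code) of a single validator that always raises ValueError, which would remove the `ValueError if isinstance(value, list) else TypeError` special-casing in the test above.

```python
from collections.abc import Sequence


def check_sequence_of_length_2(name, value):
    # Hypothetical unified check: wrong type and wrong length both raise ValueError
    # with the same message, so neither callers nor tests need to distinguish them.
    if isinstance(value, str) or not isinstance(value, Sequence) or len(value) != 2:
        raise ValueError(f"{name} should be a sequence of length 2, but got {value}")
```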
Hey @pmeier! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
Reviewed By: vmoens
Differential Revision: D47186575
fbshipit-source-id: 93dcb9c8c60dc702870dfe6cc5c0d2a2a178a77c
cc @vfdev-5